Python Job: Sr. Data Engineer (AWS, Python) - Hybrid (2 days onsite)

Company

Recru, LLC

Location

Houston, Texas - United States of America

Job type

Full-Time

Python Job Details

We're looking for a Senior AWS Data Engineer skilled in Python who is open to working hybrid in Houston! This role is contract-to-hire.

Purpose

  • Establish data feeds to the digital platform and manage ETL processes. Design raw file data structures to be used in data science applications. Associated commercial data includes data required for forecasting, analysis, optimization modeling and data visualization.
  • Work with Data Analysts to ensure enterprise data is cataloged in accordance with data governance standards.
  • Develop system test and integration plans, execute test procedures, support user community, and obtain approvals for change management.
  • Provide hands-on technical configuration, application development, integration and testing. Implement cloud data integration strategies between cloud providers.
  • Provide technical expertise to the business and internal IT on digital solutions, including data visualization, and facilitate business expansion with solutions that scale, managing the critical input and output data.
  • Design, develop, and maintain data solutions that leverage the power and flexibility of AWS (an illustrative sketch follows this list).
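
As a hedged illustration of the Purpose bullets above (not the employer's actual pipeline), the sketch below shows the general shape of such an ETL feed in Python: pull a raw commercial data file from S3, apply light structuring for data science use, and land it as Parquet. The bucket, key, and column names are hypothetical placeholders.

```python
# Minimal ETL sketch: raw CSV feed from S3 -> lightly structured Parquet.
# All bucket/key/column names are hypothetical; credentials come from the
# standard boto3 configuration chain. Requires boto3, pandas, and pyarrow.
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

RAW_BUCKET = "example-raw-feeds"         # hypothetical source bucket
CURATED_BUCKET = "example-curated-data"  # hypothetical target bucket


def load_raw_csv(key: str) -> pd.DataFrame:
    """Read one raw CSV feed from S3 into a DataFrame."""
    obj = s3.get_object(Bucket=RAW_BUCKET, Key=key)
    return pd.read_csv(io.BytesIO(obj["Body"].read()))


def write_curated_parquet(df: pd.DataFrame, key: str) -> None:
    """Write the structured frame back to S3 as Parquet."""
    buf = io.BytesIO()
    df.to_parquet(buf, index=False)
    s3.put_object(Bucket=CURATED_BUCKET, Key=key, Body=buf.getvalue())


def run(key: str) -> None:
    df = load_raw_csv(key)
    # Example structuring step: normalize column names and parse timestamps
    # so forecasting and visualization jobs downstream see a consistent schema.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    if "trade_date" in df.columns:
        df["trade_date"] = pd.to_datetime(df["trade_date"])
    write_curated_parquet(df, key.replace(".csv", ".parquet"))


if __name__ == "__main__":
    run("prices/2024/01/power_prices.csv")  # hypothetical feed path
```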

Essential Job Functions

  • Developing, testing, and deploying data pipelines, data lakes, data warehouses, and data marts using AWS services such as S3, Glue, Athena, Redshift, EMR, Kinesis, Lambda, and more (a minimal example of this pattern appears after this list).
  • Implementing data ingestion, processing, transformation, and delivery using various methods such as batch, streaming, and real-time.
  • Applying best practices for data quality, security, reliability, and performance using AWS services such as Lake Formation, IAM, KMS, CloudTrail, CloudFormation, and more.
  • Collaborating with data architects, data analysts, data scientists, citizen data scientists, and developers to understand data requirements and provide data solutions that meet business needs and follow the AWS Well-Architected Framework.
  • Evaluating and adopting new AWS technologies and features to improve our data capabilities and optimize our data costs.
  • Troubleshooting and resolving data issues and ensuring data availability and integrity.
  • Documenting and maintaining data solutions and data standards.
  • Using Dataiku DSS to build pipeline workflows and support citizen data scientists.
  • Designing and building web applications, REST APIs, and GraphQL APIs using AWS services such as API Gateway, Lambda, and DynamoDB, along with frameworks such as Flask.
  • Applying software engineering rigor and best practices to machine learning, including AI/MLOps, CI/CD, automation, etc.
  • Facilitating the development and deployment of proof-of-concept machine learning systems using AWS services such as SageMaker, Dataiku, and more.
  • Developing and deploying scalable tools and services for our clients to handle machine learning training and inference using AWS services such as SageMaker, ECR, EKS, ECS, and more.
  • Architect digital data solutions for GEM US for all Commercial Operations (Trading & Assets).
  • Implement ETL for required enterprise data.
  • Work with other IT operations support resources in a collaborative manner to achieve cross-functional goals.
  • Identify deficiencies and recommend solutions to management and the user community.
  • Provide automation where applicable to minimize business impact and ongoing IT support.
  • Continual performance assessment of processes and applications.
  • Oversee and mentor key support resources, training them to provide primary and backup system operations and maintenance functions.
  • Provide automation of asset related data imports where needed and provide complex reporting solutions.
  • Effectively communicate with users, peers and management.
  • Leverage people, processes and technologies from GEM Europe and ENGIE DIGITAL teams.
  • Work closely with the new Data team as it is set up in GEM US and monitor critical work in Collibra.
  • Assist in release management planning and execution.
  • Work with vendors such as Dataiku and Databricks to ensure reliable and secure operations on AWS (Common Data Hub) and other cloud solution providers.
  • Provide input to program managers of large projects and present issues for team resolution when appropriate.
  • Coordinate with other application teams for data integration, such as ETRM systems, ISO Market Systems, SCADA and data warehouses.
  • Develop and implement data integration and ETL processes, including data quality checks, error handling, and performance optimization.
  • Work with data scientists, business analysts, and other stakeholders to understand data requirements and design data solutions that meet their needs.
  • Develop and maintain data models, metadata, data lineage documentation, and associated project documentation.
  • Troubleshoot and resolve data pipeline and processing issues in a timely manner.
  • Participate in code reviews, design discussions, and agile ceremonies.
  • Keep up-to-date with the latest AWS technologies, data engineering best practices, and industry trends.
  • Assist in training other IT team members and business users where needed, including development processes on the digital platform.
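
Several of the functions above center on the S3 + Glue + Athena serving pattern. The sketch below is one minimal way that pattern is commonly wired up in Python with boto3 (submit an Athena query against a Glue-cataloged table and wait for the result location); it is illustrative only, and the database, table, and bucket names are assumptions, not part of this role's actual environment.

```python
# Minimal Athena-over-Glue sketch: run a query and return its S3 result path.
# Database, table, and output bucket are hypothetical; requires boto3 and
# AWS credentials with Athena/Glue/S3 permissions.
import time

import boto3

athena = boto3.client("athena")

DATABASE = "example_commercial_db"               # hypothetical Glue database
OUTPUT = "s3://example-athena-results/queries/"  # hypothetical results location


def run_query(sql: str) -> str:
    """Start an Athena query, poll until it finishes, and return the result path."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT},
    )["QueryExecutionId"]

    while True:
        status = athena.get_query_execution(QueryExecutionId=qid)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {qid} ended in state {state}")
    return f"{OUTPUT}{qid}.csv"


if __name__ == "__main__":
    # Hypothetical daily aggregate over a curated price table.
    print(run_query(
        "SELECT trade_date, AVG(price) AS avg_price "
        "FROM power_prices GROUP BY trade_date"
    ))
```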

Requirements

  • Knowledge of energy market data.
  • In-depth knowledge of digital platforms such as AWS, Dataiku and Databricks.
  • Proven, in-depth software engineering experience with Python, along with machine learning knowledge.
  • Designing and building scalable data architectures on AWS, with strong experience using AWS data storage and processing services such as S3, EMR, AWS Lake Formation, AWS Glue, and Amazon Athena.
  • Experience in working with DevOps tools and methodologies such as Git, Jenkins, Docker, or Kubernetes.
  • Proficiency in programming languages and tools such as Python, Java, Scala, SQL, Spark, and Unix shell scripting, including the AWS CLI.
  • Automating data processing workflows using AWS services, AWS CLI, APIs, and scripting languages like Python, Java, and/or Scala.
  • Experience with SOAP/RESTful and GraphQL APIs using AWS services and frameworks and familiarity with other cloud services such as Snowflake.
  • Knowledge of ETL tools and frameworks such as Apache Spark, Airflow, AWS IoT Analytics, IoT Device Management, and AWS IoT Events.
  • Familiarity with database technologies such as SQL, NoSQL, and data warehousing concepts. Working knowledge of Amazon Aurora, DynamoDB, and Amazon RDS.
  • Understanding of data security, privacy, and compliance requirements.
  • Experience with data modeling, metadata management, and data lineage tracking.
  • Strong analytical, problem-solving, and communication skills.
  • Ability to work in a team environment and collaborate with cross-functional teams.
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Current experience with data visualization tools such as Power BI Service and Desktop.
  • Prior design and implementation experience with data warehouses.
  • Expert knowledge of SQL databases, SQL queries, Amazon Athena, SQL Server Integration Services, SQL Server Reporting Services, and data formats such as XML and JSON.
  • Knowledge of ISOs and major market activities preferred.

Education/Experience

  • Bachelor's degree in a technical field, or 15+ years of software development experience in lieu of a degree.
  • 10-15 years of information technology, systems, and application development experience.
  • Minimum 5 years' experience supporting the needs of an Energy Trading or Asset Management business is preferred.
  • Proficient with Microsoft Office products (Excel, Word, Visio, PowerPoint).
  • Proficient with Windows and Linux operating systems.
  • ISO Market systems knowledge preferred.
  • Experience with modern cloud technologies used for managing data (AWS is highly preferred).